ENH: disallow numpy generics #145

Merged
merged 2 commits, Apr 23, 2025

Conversation

crusaderky
Contributor

@crusaderky crusaderky commented Apr 2, 2025

In #135, binary operators vs. numpy.int64 became (correctly) disallowed.
Add a test for it.
Also test binary operators vs. numpy.float64 and numpy.complex128, which must be rejected even though they are subclasses of float and complex respectively.

The same treatment for functions is in a separate PR: #148.

)
def test_binary_operators_vs_numpy_float(op):
    """
    np.float64 is a subclass of float and must be allowed.

Member

Have to admit I'm not a fan of this. It is technically correct: np.float64 is a subclass of float. That said, it is an implementation detail IMO.

In all instances where numpy returns a numpy scalar, all other libraries return 0-D arrays, so I think we'd better always handle numpy scalars together with arrays, not Python scalars.

Contributor Author
@crusaderky crusaderky commented Apr 8, 2025

Are you saying that

x = xp.asarray(1, dtype=xp.float32)
y = np.float64(2)
z = xp.maximum(x, y)

maximum should crash instead of returning an Array[float32]?

Member

Definitely not crash :-).

I think the following is correct (because np.float64(2) is conceptually np.asarray(2.0), not the Python scalar 2.0):

In [4]: xp = array_namespace(np.empty(2))

In [5]: xp.maximum(xp.asarray(1, dtype=xp.float32), 2)
Out[5]: np.float32(2.0)

In [6]: xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Out[6]: np.float64(2.0)

Likewise, the following is also correct, for the same reason:

In [12]: xpt = array_namespace(torch.empty(2))

In [13]: xpt.maximum(xpt.asarray(1, dtype=xpt.float32), np.float64(2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 xpt.maximum(xpt.asarray(1, dtype=xpt.float32), np.float64(2))

File ~/repos/array-api-compat/array_api_compat/torch/_aliases.py:91, in _two_arg.<locals>._f(x1, x2, **kwargs)
     88 @_wraps(f)
     89 def _f(x1, x2, /, **kwargs):
     90     x1, x2 = _fix_promotion(x1, x2)
---> 91     return f(x1, x2, **kwargs)

TypeError: maximum(): argument 'other' (position 2) must be Tensor, not numpy.float64

Contributor Author

In [6]: xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Out[6]: np.float64(2.0)

This definitely looks wrong to me: array-api-strict shouldn't quietly accept a numpy array.
Right now the output is the expected one, without explicitly disallowing inheritance:

>>> xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Array(2., dtype=array_api_strict.float32)
>>> xp.maximum(xp.asarray(1, dtype=xp.float32), np.asarray(2, dtype=np.float64))
TypeError: Only real numeric dtypes are allowed in maximum(...). Got array_api_strict.float32 and float64.
>>> xp.maximum(xp.asarray(1, dtype=xp.int32), np.int64(2))
TypeError: Only real numeric dtypes are allowed in maximum(...). Got array_api_strict.int32 and int64.
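
The asymmetry demonstrated above traces back to numpy's class hierarchy. The checks below (runnable with plain numpy) show that np.float64 and np.complex128 pass Python-scalar isinstance checks, while np.int64 does not, which is why the integer case was already rejected as a side effect of #135:

```python
import numpy as np

# np.float64 and np.complex128 subclass the Python builtins, so a plain
# isinstance(x, float | complex) check quietly accepts them:
assert isinstance(np.float64(2), float)
assert isinstance(np.complex128(2), complex)

# np.int64 does NOT subclass int, so the same style of check rejects it:
assert not isinstance(np.int64(2), int)
```

Disallowing np.float64 and np.complex128 therefore needs an explicit check rather than relying on the Python-scalar isinstance test.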

Contributor Author

nit: the message of the TypeError is a bit misleading; it should explicitly call out something like "only Array and Python scalars are accepted; got numpy.ndarray".

Member

This definitely looks wrong to me: array-api-strict shouldn't quietly accept a numpy array.
Right now the output is the expected one, without explicitly disallowing inheritance:

Yes, in my example xp is array_api_compat.numpy. As your example shows, array-api-strict does the right thing in main, too. So what is this PR fixing?

Member

@ev-br ev-br left a comment

So this PR does two things:

  • adds a test that binary dunders reject arrays on different devices
  • adds tests which codify the behavior of binary dunders with numpy scalars.

The former is great IMO; the latter is less so. Essentially, I'd consider it a bug that np.float32(1.) behaves differently from np.float64(1.). Either they both should duck-type as the built-in float, or they both should duck-type as a 0-D array of the corresponding dtype.

To tell which behavior is better, consider how a numpy scalar might find its way into array-api-strict. I see two possible scenarios:

  • a user explicitly creates an np.int64(python_int) or np.float32(python_float). Old code does that sometimes. IMO the user code should decide whether it needs a builtin object or a 0-D array, and convert accordingly.
  • some numpy operation (indexing or a ufunc, for instance) creates a numpy scalar. In those cases any other backend would have created a 0-D array. Therefore, array-api-strict can have two possible behaviors: reject all numpy scalars or convert them into 0-D arrays.

Rejecting numpy scalars is probably not very useful: it would require that all user code explicitly call asarray just to make user tests pass on both numpy and array-api-strict.
Converting them all sounds more useful for downstream testing, which is the main purpose of array-api-strict.

Therefore, I suggest that instead of codifying the existing bug, we add a test (or several) that all instances of np.generic duck-type as arrays in both binary functions and binary dunders, and actually make it happen: I think it will require something like if isinstance(x, np.generic): x = xp.asarray(x) before checking isinstance(x, bool | int | float | complex).
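
The suggested conversion could look like the following sketch (hypothetical helper, not array-api-strict's actual code; plain np.asarray stands in for the namespace's own asarray):

```python
import numpy as np

def coerce_operand(x, asarray=np.asarray):
    """Sketch of the suggested conversion: treat any numpy scalar
    (an np.generic instance) as a 0-D array before the Python-scalar
    check. Hypothetical helper for illustration only."""
    if isinstance(x, np.generic):
        # np.float64(2) is conceptually np.asarray(2.0), not the scalar 2.0,
        # so convert it before it can pass the float/complex subclass check.
        return asarray(x)
    if isinstance(x, (bool, int, float, complex)):
        return x  # genuine Python scalar: promote via the usual dtype rules
    raise TypeError(f"Expected Array or Python scalar; got {type(x)}")
```

With plain numpy as the backend, `coerce_operand(np.float64(2))` returns a 0-D float64 array, while `coerce_operand(2.0)` passes the Python float through unchanged.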

else:
    raise TypeError(f"Expected Array | python scalar; got {type(other)}")
elif not isinstance(other, bool | int | float | complex):
    raise TypeError(f"Expected Array or Python scalar; got {type(other)}")

Member

Note to self: the changes to _check_device are just cosmetic, plus a rename; the functionality is the same.

"__rshift__": "integer",
"__sub__": "numeric",
"__truediv__": "floating-point",
"__xor__": "integer or boolean",

Member

Note to self: the values here and in unary_op_types below correspond to dtype categories from _dtypes.py, so these two dicts are similar to those in https://github.com/data-apis/array-api-strict/blob/main/array_api_strict/tests/test_elementwise_functions.py#L36

a = asarray(1)
i64 = np.int64(1)
with pytest.raises(TypeError, match="Expected Array or Python scalar"):
    getattr(a, op)(i64)

Member

Note to self: this test basically codifies the status quo; the behavior is the same on main.

for op in binary_op_dtypes:
    assert isinstance(func(f64), Array)
    with pytest.raises(TypeError, match="Expected Array or Python scalar"):
        func(f32)

Member

This test also reflects the status quo: the behavior is exactly the same on main.

a = asarray(1, dtype=dtype, device=CPU_DEVICE)
b = asarray(1, dtype=dtype, device=Device("device1"))
with pytest.raises(ValueError, match="different devices"):
    getattr(a, op)(b)

Member

This test would be great to extend to binary functions, too. I'd expect them to work just the same, because they reuse the ._promote_scalar logic. If xp.add(a, b) differs from a.__add__(b), it's a bug, so it would be great to test for it.
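
The invariant described above (a namespace function and the matching dunder must agree) can be sketched as a small generic checker. This is a hypothetical test helper, exercised here with plain Python ints and the operator module rather than array-api-strict arrays:

```python
import operator

def agree(func, dunder, a, b):
    """Return True if func(a, b) and getattr(a, dunder)(b) either both
    succeed with equal results, or both raise the same exception type.
    Hypothetical helper illustrating the consistency invariant above."""
    def outcome(call):
        try:
            return ("ok", call())
        except Exception as exc:
            return ("err", type(exc))
    return outcome(lambda: func(a, b)) == outcome(lambda: getattr(a, dunder)(b))

# With plain ints the namespace-level function and the dunder trivially agree:
assert agree(operator.add, "__add__", 1, 2)
```

In a real test suite, `func` would be something like `xp.add` and `a`, `b` would be arrays on mismatched devices, so both paths should raise the same error.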

Contributor Author

It turns out extending this to binary functions caused the scope to blow up, so I moved it to a separate PR: #148

@crusaderky
Contributor Author

I've reworked this PR to disallow np.float64 and np.complex128 from being treated as float and complex in Array methods.
The same treatment for functions is in #148.
This is ready for a second round of review.
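
The order of the checks matters here, because np.float64 and np.complex128 pass isinstance checks against float and complex. A minimal sketch of the kind of rejection logic involved (hypothetical, not the actual diff in this PR):

```python
import numpy as np

def check_scalar_operand(other):
    """Minimal sketch of rejecting numpy generics (hypothetical helper,
    not array-api-strict's actual code). The np.generic check must come
    first: np.float64 and np.complex128 subclass float and complex, so
    they would otherwise slip through the Python-scalar check below."""
    if isinstance(other, np.generic):
        raise TypeError(f"Expected Array or Python scalar; got {type(other)}")
    if isinstance(other, (bool, int, float, complex)):
        return other
    raise TypeError(f"Expected Array or Python scalar; got {type(other)}")
```

Under this check, `check_scalar_operand(2.0)` passes the Python float through, while `check_scalar_operand(np.float64(2))` raises TypeError even though the argument is a float subclass.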

@crusaderky crusaderky changed the title TST: test binary operators vs. numpy generics ENH: disallow numpy generics Apr 23, 2025
Member

@ev-br ev-br left a comment

LGTM now.

There is a non-zero chance that this causes issues downstream. However, I believe the change is correct and useful, and it's downstream that should adjust to better align with the spec.
If push comes to shove, we'll temporarily revert. So let's do EAFP, not LBYL, and land this. Thanks @crusaderky

@ev-br ev-br merged commit 17c7c40 into data-apis:main Apr 23, 2025
19 checks passed
@ev-br ev-br added this to the 2.4 milestone Apr 23, 2025